fact book
Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts
Gupta, Raavi, Panicker, Pranav Hari, Bhatia, Sumit, Ramakrishnan, Ganesh
Large language models (LLMs), despite their remarkable text generation capabilities, often hallucinate, generating text that is factually incorrect and not grounded in real-world knowledge. This poses serious risks in domains like healthcare, finance, and customer support. LLMs are typically used via vendor-provided APIs that offer no access to model weights and no option to fine-tune the model. Existing methods for detecting hallucinations in such restricted or resource-constrained settings typically require multiple LLM API calls, increasing latency and API cost. We introduce CONFACTCHECK, an efficient hallucination detection approach that uses no external knowledge base and rests on a simple intuition: responses to factual probes within the generated text should be consistent within a single LLM and across different LLMs. Rigorous empirical evaluation on multiple datasets covering both factual text generation and open-ended generation shows that CONFACTCHECK detects hallucinated facts efficiently, using fewer resources and achieving higher accuracy than existing baselines operating under similar conditions. Our code is available here.
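The consistency intuition the abstract describes can be sketched as a small self-consistency check; `ask_llm`, the fact format, and the probing strategy below are illustrative assumptions, not the paper's actual interface:

```python
# Hedged sketch of the consistency intuition: re-ask the model about each
# extracted fact and flag facts whose probed answers disagree with each
# other or with the originally generated answer. `ask_llm` is a
# hypothetical LLM API call, not part of CONFACTCHECK itself.

def consistency_check(facts, ask_llm, n_samples=3):
    """Return the facts whose probed answers are not self-consistent."""
    flagged = []
    for fact in facts:
        probe = f"Answer concisely: {fact['question']}"
        answers = {ask_llm(probe).strip().lower() for _ in range(n_samples)}
        # Disagreement across probes, or with the original answer,
        # is treated as a hallucination signal.
        if len(answers) > 1 or fact["answer"].lower() not in answers:
            flagged.append(fact)
    return flagged
```

In practice the probes would be generated from the model's own output and could be sent to a second LLM as well, matching the "within a single LLM and across different LLMs" framing.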
Aligning LLMs for the Classroom with Knowledge-Based Retrieval -- A Comparative RAG Study
Jain, Amay, Cui, Liu, Chen, Si
Large language models like ChatGPT are increasingly used in classrooms, but they often provide outdated or fabricated information that can mislead students. Retrieval-Augmented Generation (RAG) improves the reliability of LLMs by grounding responses in external resources. We investigate two accessible RAG paradigms, vector-based retrieval and graph-based retrieval, to identify best practices for classroom question answering (QA). Existing comparative studies fail to account for pedagogical factors such as educational disciplines, question types, and practical deployment costs. Using a novel dataset, EduScopeQA, of 3,176 questions across academic subjects, we measure performance on various educational query types, from specific facts to broad thematic discussions. We also evaluate system alignment with a dataset of systematically altered textbooks that contradict the LLM's latent knowledge. We find that OpenAI Vector Search RAG (representing vector-based RAG) performs well as a low-cost generalist, especially for quick fact retrieval. On the other hand, GraphRAG Global excels at providing pedagogically rich answers to thematic queries, and GraphRAG Local achieves the highest accuracy on the dense, altered textbooks when corpus integrity is critical. Accounting for the 10-20x higher resource usage of GraphRAG (representing graph-based RAG), we show that a dynamic branching framework that routes queries to the optimal retrieval method boosts fidelity and efficiency. These insights provide actionable guidelines for educators and system designers to integrate RAG-augmented LLMs into learning environments effectively.
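The dynamic branching idea can be illustrated with a toy router; the keyword heuristic and backend names below are assumptions for illustration, not the paper's actual classifier:

```python
# Toy sketch of routing queries between RAG backends: broad thematic
# queries go to the expensive graph-based retriever, quick fact lookups
# to the cheap vector-based one. The cue list is purely illustrative.

THEMATIC_CUES = ("why", "compare", "discuss", "theme", "overall")

def route_query(query: str) -> str:
    """Pick a retrieval backend for a classroom question."""
    q = query.lower()
    if any(cue in q for cue in THEMATIC_CUES):
        # Discussion-style queries justify the 10-20x cost of GraphRAG.
        return "graphrag_global"
    # Specific fact lookups go to the low-cost vector-search generalist.
    return "vector_search"
```

A production router would likely use an LLM or trained classifier for this decision rather than keywords, but the cost/fidelity trade-off it encodes is the same.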
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs
Nguyen, Tin, Bolton, Logan, Taesiri, Mohammad Reza, Nguyen, Anh Totti
An Achilles heel of Large Language Models (LLMs) is their tendency to hallucinate non-factual statements. A response that mixes factual and non-factual statements is difficult for humans to verify and to base decisions on accurately. To combat this problem, we propose Highlighted Chain-of-Thought Prompting (HoT), a technique for prompting LLMs to generate responses with XML tags that ground facts to those provided in the query. That is, given an input question, the LLM first re-formats the question to add XML tags highlighting key facts, and then generates a response with highlights over the facts referenced from the input. Interestingly, in few-shot settings, HoT outperforms vanilla chain-of-thought prompting (CoT) on 17 tasks ranging from arithmetic and reading comprehension to logical reasoning. When asked to verify LLM responses, time-limited human participants recognize more accurately and efficiently when LLMs are correct if highlights are shown. Yet, surprisingly, when LLMs are wrong, HoT tends to make users believe an answer is correct.
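The re-formatting step HoT performs can be sketched mechanically; the numbered tag scheme below is an assumption for illustration, and the paper's exact prompt format may differ:

```python
# Illustrative sketch of the HoT re-formatting step: wrap key fact spans
# from the input question in XML tags so the model's answer can reference
# them. The <factN> tag naming is an assumed convention, not the paper's.

def highlight_facts(question: str, key_facts: list[str]) -> str:
    """Wrap each key fact span in numbered XML tags."""
    for i, fact in enumerate(key_facts, start=1):
        question = question.replace(fact, f"<fact{i}>{fact}</fact{i}>")
    return question
```

In HoT this tagging is done by the LLM itself as the first stage of the prompt, not by string matching; the sketch only shows what the tagged question looks like.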
FactFlow: Automatic Fact Sheet Generation and Customization from Tabular Dataset via AI Chain Design & Implementation
Vu, Minh Duc, Chen, Jieshan, Xing, Zhenchang, Lu, Qinghua, Xu, Xiwei, Fu, Qian
With the proliferation of data across various domains, there is a critical demand for tools that enable non-experts to derive meaningful insights without deep data analysis skills. To address this need, existing automatic fact sheet generation tools offer heuristic-based solutions to extract facts and generate stories. However, they inadequately grasp the semantics of data and struggle to generate narratives that fully capture the semantics of the dataset or align the fact sheet with specific user needs. Addressing these shortcomings, this paper introduces FactFlow, a novel tool designed for the automatic generation and customisation of fact sheets. FactFlow applies the concept of collaborative AI workers to transform raw tabular datasets into comprehensive, visually compelling fact sheets. We define an effective taxonomy to profile AI workers for specialised tasks. Furthermore, FactFlow empowers users to refine these fact sheets through intuitive natural language commands, ensuring the final outputs align closely with individual preferences and requirements. Our user evaluation with 18 participants confirms that FactFlow not only surpasses state-of-the-art baselines in automated fact sheet production but also provides a positive user experience during customisation tasks.
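The collaborative AI-worker design can be reduced to a minimal pipeline sketch; the worker roles a real system would use (fact extraction, narration, layout) are only assumed here, not taken from the paper:

```python
# Minimal sketch of an "AI chain": specialised workers run in order,
# each consuming the previous worker's output. Real workers would be
# LLM-backed (e.g. fact extractor, narrator, layout designer); the
# lambdas used in any example are stand-ins.

def run_chain(dataset, workers):
    """Pipe a raw dataset through an ordered list of AI workers."""
    artifact = dataset
    for worker in workers:
        artifact = worker(artifact)
    return artifact
```

The taxonomy the paper mentions would correspond to how each worker in the list is profiled (its role, inputs, and outputs) before being slotted into the chain.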
Global Economic Impact of AI: Facts and Figures
Wall Street, venture capitalists, technology executives, data scientists -- all have important reasons to understand the growth and opportunity in the artificial intelligence market in order to assess business growth and opportunities. This gives them insight into funds invested in AI and analytics, as well as potential revenue growth and turnover. Indeed, the growth of AI, continuing research, the development of easier open-source libraries, and applications in small- to large-scale industries are sure to revolutionize the industry over the next two decades, and the impact is being felt in almost all countries worldwide. To dive deep into the growth of AI and future trends, an insight into the type and size of the market is essential, along with (a) AI-related industry market research forecasts and (b) data from reputable research sources for insight into AI valuation and forecasting. IBM's CEO claims a potential $2 trillion market for "cognitive computing".
Artificial Intelligence Fact Sheet - Content Science Review
Content Science is a content strategy and intelligence firm based in Atlanta, GA. Founded in 2010 by Colleen Jones, author of Clout: The Art and Science of Influential Web Content, our mission is to transform industries, organizations, and individuals for the better by putting content first. We offer professional services, publications, and software for clients ranging from Fortune 50 companies to nonprofits to government agencies.
Question Answering from Frequently Asked Question Files: Experiences with the FAQ FINDER System
Burke, Robin D., Hammond, Kristian J., Kulyukin, Vladimir, Lytinen, Steven L., Tomuro, Noriko, Schoenberg, Scott
This article describes FAQ FINDER, a natural language question-answering system that uses files of frequently asked questions as its knowledge base. Unlike AI question-answering systems that focus on the generation of new answers, FAQ FINDER retrieves existing ones found in frequently asked question files. Unlike information-retrieval approaches that rely on a purely lexical metric of similarity between query and document, FAQ FINDER uses a semantic knowledge base (WORDNET) to improve its ability to match question and answer. We include results from an evaluation of the system's performance and show that a combination of semantic and statistical techniques works better than any single approach.
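The combination of semantic and statistical matching that FAQ FINDER uses can be sketched as a blended scoring function; the `semantic_sim` callable below is a stand-in for the WORDNET-based metric, and the weighting is an illustrative assumption:

```python
# Toy sketch of FAQ FINDER's matching idea: score each FAQ question
# against the user query with a blend of lexical overlap (the purely
# statistical signal) and a semantic similarity signal. `semantic_sim`
# stands in for the WORDNET-based metric described in the article.

def score(query_terms, faq_terms, semantic_sim, w_lex=0.5):
    """Combine lexical overlap with a semantic similarity signal."""
    q, f = set(query_terms), set(faq_terms)
    lexical = len(q & f) / max(len(q), 1)
    return w_lex * lexical + (1 - w_lex) * semantic_sim(query_terms, faq_terms)

def best_faq(query_terms, faqs, semantic_sim):
    """Pick the FAQ entry whose question best matches the query."""
    return max(faqs, key=lambda e: score(query_terms, e["terms"], semantic_sim))
```

The point of the blend is the article's closing claim: either signal alone misses matches (synonyms defeat the lexical metric, spurious relatedness defeats the semantic one), while the combination outperforms any single approach.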